Programming in Icon
A Simple Search Program
(C) 1987 by Larry Phillips

I decided to write a search program that mimics the Amiga-supplied SEARCH command, to see how Icon compares with other languages. The minimal search, which is case sensitive and does not print line numbers, is as follows:

    procedure main(args)
       in := open(args[1]) | stop("Can't open input file")
       pattern := args[2]
       while line := read(in) do
          if find(pattern,line) then
             write(line)
          else next
    end

A few words of explanation are in order. Arguments are passed as a list of strings. args[1] is the first item in the list, and is the file name. The line that opens the file assigns the file handle to the variable 'in' if the open succeeds. If it does not, the program stops with the message "Can't open input file". An English reading of the line would be "Either open the file, or stop and print a message".

Note that even though I opened the file, I have not closed it anywhere. This is because all files are closed when the program terminates. If you are working with a large number of files, it is good practice to close them as they become unnecessary.

The second element of the argument list is the search string. I have added the extra assignment of the search string to the variable 'pattern' for clarity only, and could as easily have used 'args[2]' later on in the 'if find..' statement.

The 'while..' statement reads lines from the specified file until either EOF or an error condition is encountered, and assigns each line read to the variable 'line'. It operates much like the while in C, in that it executes the statement that follows it again and again until the terminating condition is met.

Note that the 'if' statement following the 'while' includes the 'write', which is on the next line for clarity only. It could have been written as:

    if find(pattern,line) then write(line)

The 'find' function returns the position of one string within another.
In this case we are only interested in the success or failure of 'find'. If it succeeds, we write the line to the screen. If it fails, the 'else' clause executes and the 'next' takes us back to continue with the 'while' statement. The 'end' marks the end of the procedure.

Since we might be interested in the line number on which any match occurs, and since we want to make the search case insensitive, I modified the program to handle these criteria. Now bear in mind that I am strictly a novice Icon programmer, and that there are undoubtedly better ways to do this. The modified code looks like this:

    procedure main(args)
       i := 0
       in := open(args[1]) | stop("Can't open input file")
       pattern := args[2]
       while line := read(in) do {
          i +:= 1
          if find(pattern,map(line,&ucase,&lcase)) then
             write(i || " " || line)
          else next
       }
    end

Notice that we are using a variable to count lines. The use of the variable 'i' is straightforward, but notice that we now have two statements to perform within the 'while' control block. The curly braces are used to group the lines logically as part of the 'while' statement, exactly as in a C program.

The case insensitivity is provided by modifying the 'find'. The 'map' function causes any characters in 'line' that appear in the second string to be replaced with the corresponding characters in the third string. As an example, the statement:

    write(map("hello there","e","-"))

will cause the string "h-llo th-r-" to be written to the screen.

The mapping arguments in this case are 'csets', which are sets of unique characters; Icon converts them to strings where a string is expected. An example of a cset would be the set of vowels, which can be generated with the statement:

    vowel := 'aeiou'

The exact same set can be made with either:

    vowel := 'uoiea'

or

    vowel := 'aaaeeeiiiooouuu'

Since the characters are unique, and since the idea of order in a cset is meaningless, the same cset is produced in all three cases.
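
The 'map' example and the three cset assignments above can be tried in a few lines. What follows is only a minimal sketch: the variable names 'a', 'b' and 'c' are chosen here for illustration, and the '===' operator (which succeeds when its two operands are the same value) and the '*' size operator are used to compare the sets.

```icon
procedure main()
   # Every "e" in the first string is replaced by "-".
   write(map("hello there", "e", "-"))      # h-llo th-r-

   # Three ways of writing the set of vowels.
   a := 'aeiou'
   b := 'uoiea'
   c := 'aaaeeeiiiooouuu'

   # === succeeds when its operands are the same value, so this
   # message appears only if all three csets are identical.
   if a === b === c then
      write("all three csets are the same")

   write(*c)    # number of members in the cset: 5
end
```

The repeated letters in the third literal do not add members, which is why its size is 5 rather than 15.
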

In the example program, there are two csets used, &lcase and &ucase. These are "built in" csets that represent the set of all lower case characters and the set of all upper case characters, respectively. The call 'map(line,&ucase,&lcase)' returns a string that consists of whatever is contained in the variable 'line', but with all the upper case characters mapped to lower case. The 'find' function then does its work on the mapped string.

To be truly general, and to allow the search string to be in mixed case, 'map' should also be applied to the variable 'pattern'. This is best done outside the loop to save time, since it needs to be done only once. To implement this, we would change the line 'pattern := args[2]' to read 'pattern := map(args[2],&ucase,&lcase)'.

The last change is in the line we write to the screen when the find is successful. The '||' operator concatenates strings. Note that due to Icon's automatic type conversion, we do not have to explicitly convert the variable 'i', which we have been using as a numeric counter, into a string. The type is converted for us, and the resulting line consists of the line number, followed by a space, followed by the line itself.

So, that's it. Now for the burning questions. How big is the resulting code, and how is the speed? How long does it take to compile?

Icon is not a true "compiled" language, nor is it a true "interpreted" language. The size of the resulting program is only 3484 bytes, but when it is executed, the run-time module (iconx) must be loaded in, and it is nearly 100K. On a 1.5 meg system, with Icon kept in RAM:, the compile speed is lightning fast, taking (for this program) just over 2 seconds.

The speed is surprising for such a high level language. I have a file I search on a regular basis that is about 120K. For the test, I used a stopwatch, and copied the AmigaDOS SEARCH command into RAM: so that load times would not be dependent upon device speed.
The results were:

    AmigaDOS SEARCH -- about 51 seconds
    Icon search     -- about 39 seconds

Note that the Icon search command, if you had it on disk along with 'iconx', would have taken longer to load, and the results would have been a lot closer.

I feel that as a high level language, Icon is a very respectable language for "quick and dirty" or "one time" programs, especially in the area of text processing. The price is right (it's PD), and it's relatively easy to learn.
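
For reference, the second version of the program with the suggested change to 'pattern' applied might look like the sketch below. The helper procedure 'lower', the message printed when no search string is given, and the explicit 'close' are additions made here for illustration and are not part of the listings above. This variant is not the one that was timed.

```icon
# Hypothetical helper: returns s with upper case letters
# mapped to lower case.
procedure lower(s)
   return map(s, &ucase, &lcase)
end

procedure main(args)
   in := open(args[1]) | stop("Can't open input file")

   # Fold the pattern once, outside the loop. If args[2] is
   # missing, the assignment fails and we stop with a message.
   pattern := lower(args[2]) | stop("No search string given")

   i := 0
   while line := read(in) do {
      i +:= 1
      # Compare against a lower case copy of the line, but
      # print the line as it appears in the file.
      if find(pattern, lower(line)) then
         write(i || " " || line)
   }
   close(in)
end
```

Given a file name and a search string as its two arguments, this prints each matching line preceded by its line number, whatever the case of the search string.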